Nikola Kolev1,2, Taylor Stock1,2, David Gao3,4,1, Filippo Federici3,5, Emily Hoffman1,2, Steven Schofield1,2, Max Trouton1,2, Geoff Thornton1,2
1University College London
2London Centre for Nanotechnology
3Nanolayers Research Computing LTD
4Norwegian Institute of Science and Technology
5Aalto University
Machine learning (ML) has achieved high accuracy across a range of tasks, including image recognition, natural language processing, and gameplay, and the field of materials science has increasingly adopted these methods. However, a significant challenge remains: the most popular ML techniques are typically supervised and require large amounts of labelled data to train accurate models. This is a difficulty when working with novel data, as is frequently the case in academic research. One way to circumvent the problem is to use ML frameworks that require less labelled data. In this work, we explore the utility of unsupervised learning and few-shot learning (FSL) for segmenting scanning tunnelling microscopy (STM) images in order to identify and classify both inherent surface defects and adsorbates formed by surface chemical reactions. Unsupervised learning is used to generate training data for a UNet, which locates defects and adsorbates on the surface. The FSL algorithm then classifies these features. After initial training, FSL can classify previously unseen classes using only a few new labelled data points. FSL has been widely used in the ML community for various data types, such as images, audio, and radar, and has gained some attention in astronomy for differentiating galaxy types. Its application in microscopy, however, has been limited. We investigate its applicability by testing five different FSL algorithms on three different surfaces (Si(001):H, Ge(001), and TiO2(110)). This approach demonstrates the potential for greater flexibility of ML algorithms in microscopy and a significant reduction in the amount of training data required.
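To illustrate how a few labelled examples can be enough to classify a previously unseen class, the sketch below shows one common FSL idea: nearest-prototype classification. This is an assumption made for illustration only. The abstract does not name the five FSL algorithms tested, and the function names (`class_prototypes`, `classify`) and the toy 2D "embeddings" are hypothetical. In practice, the embeddings would presumably come from a trained feature extractor applied to image crops around each detected defect or adsorbate.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Average the few labelled (support) embeddings of each class."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def classify(query_emb, classes, protos):
    """Assign each query embedding to the class of its nearest prototype."""
    # Squared Euclidean distances, shape (n_query, n_classes).
    diff = query_emb[:, None, :] - protos[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    return classes[dist.argmin(axis=1)]

# Toy 3-way, 5-shot episode with synthetic, well-separated 2D embeddings
# (hypothetical data, not STM measurements).
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
support_labels = np.repeat(np.arange(3), 5)
support_emb = centres[support_labels] + 0.1 * rng.normal(size=(15, 2))
query_emb = centres + 0.1 * rng.normal(size=(3, 2))

classes, protos = class_prototypes(support_emb, support_labels)
predicted = classify(query_emb, classes, protos)
```

Adding a new class in this scheme only requires computing one more prototype from its few labelled examples; no retraining of the feature extractor is implied by the sketch.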